QKD: a million signal task

Abstract. I review the ideas and main results in the derivation of security bounds
in quantum key distribution for keys of finite length. In particular, all the detailed
studies on specific protocols and implementations indicate that no secret key can
be extracted if the number of processed signals per run is smaller than $10^5$ to $10^6$.
I show how these numbers can be recovered from very basic estimates.

1. Introduction

In the very first proof of unconditional security of quantum key distribution (QKD),
Mayers already stressed the need to obtain finite-key bounds and paved the path for this
task [1]. The complex development of QKD, which has been reviewed elsewhere [2],
explains why this task was finally achieved only recently.

This text is a concise presentation of finite-key security proofs, written for readers
who are already familiar with the main notions of QKD. I use the approach developed
by Renato Renner in his thesis [3] and applied later to the specific study of finite-key
effects [4,5,6]. Independently, and actually some months earlier than Renner and myself,
Hayashi has also developed finite-key analysis [7,8,9]. A recent work by Fung, Ma and
Chau tackles the problem from yet another approach [10]. It must be stressed that all
these approaches are recognized to be ultimately based on the same definition of security
and lead to the same order of magnitude for the finite-key corrections, although there
may be differences in details.



2. Finite-key corrections 

In a QKD protocol, Alice and Bob have to infer the information that could have leaked 
to the eavesdropper Eve on the basis of some measured parameters: error rates, detection 
rates, and others. These parameters and the way of measuring them define the protocol. 
Let us denote by $V$ the measured values of these parameters.

Let $N$ be the total number of signals exchanged by Alice and Bob in a run of key exchange,
$n$ the number of signals kept for the key (i.e. the length of the raw key) and $m$
the number of signals used for parameter estimation. The length of the secret key to be
extracted is denoted by $\ell$. The usually quoted number is the secret key fraction $r = \ell/N$.

It is important to note from the start that a run of key exchange is defined as the 
number of signals that shall be processed together: obviously, the duration of the key 
exchange cannot increase its security. In all this text, I focus on one-way post-processing 
without pre-processing, so specifically a run is defined by the size of the blocks on which 
privacy amplification is performed. 

In the asymptotic limit, the expression of $r$ is by now well-known:

$$ r_\infty = \lim_{N\to\infty} \frac{\ell}{N} = \min_{E|V} H(A|E) - H(A|B) \qquad (1) $$

It has an intuitive meaning: the fraction that can be extracted is the uncertainty of Eve 
minus the uncertainty of Bob (or equivalently, the information of Bob minus the infor- 
mation of Eve) on Alice's key. 
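
As a concrete illustration of (1), here is a minimal sketch for the ideal single-photon BB84 protocol. It assumes the standard result, taken from the general QKD literature and not derived in this text, that both terms are fixed by the error rate $Q$ through the binary entropy $h$: $\min H(A|E) = 1 - h(Q)$ and $H(A|B) = h(Q)$. The function names are hypothetical.

```python
import math

def binary_entropy(p: float) -> float:
    """Binary Shannon entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def asymptotic_rate_bb84(qber: float) -> float:
    """Eq. (1) under the assumed ideal-BB84 model: r_inf = [1 - h(Q)] - h(Q)."""
    eve_uncertainty = 1.0 - binary_entropy(qber)  # min over E|V of H(A|E)
    bob_uncertainty = binary_entropy(qber)        # H(A|B)
    return eve_uncertainty - bob_uncertainty

# Secret fraction for a few error rates
for q in (0.01, 0.02, 0.05, 0.11):
    print(f"Q = {q:.2f}  ->  r_inf = {asymptotic_rate_bb84(q):.3f}")
```

In this model the asymptotic fraction drops to zero at an error rate of about 11%, and it is negative (no key) beyond that.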

The finite-key bound, to be proved below, reads

$$ r_N = \frac{\ell}{N} = \frac{n}{N}\left[ \min_{E|V\pm\Delta V} H(A|E) - \Delta(n) - \mathrm{leak}_{EC}/n \right] \qquad (2) $$

Four differences between (1) and (2) are worth stressing:



1. $n/N$: in the asymptotic case, the protocol can be adapted so that almost all the signals
are used for the raw key and only a negligible fraction is used for parameter
estimation [11]. This is of course no longer the case in the finite-key regime: one
needs to devote sufficiently many signals to parameter estimation in order to have
good statistics.

2. $V \pm \Delta V$: the statistical estimates of the parameters come with fluctuations for
finite sampling.

3. $\mathrm{leak}_{EC}$: this is the information that leaks out to Eve during error correction; in
the asymptotic case, it is given by the Shannon bound $H(A|B)$, but for practical
codes on finite samples the Shannon bound may not be reached.

4. $\Delta(n)$: while the three previous items are pretty obvious and had in fact been
anticipated in many works, both theoretical and experimental, this term is the
challenging one. It comes as an overhead to privacy amplification: the task itself
cannot be carried out perfectly on finite samples.

In summary, the theoretical challenge of finite-key analysis mainly consists of obtaining
an expression for $\Delta(n)$. As we shall show, once this expression is obtained, one can
study which of the corrections has the largest effect; it will turn out that the most dramatic
correction comes from the "obvious" need to take into account $\Delta V$, the statistical
fluctuations in the estimate of the parameters.



3. Derivation of the finite-key bound 

3.1. Scenario 

The following equation summarizes the steps of a QKD exchange:

In words: on the quantum channel, Eve interacts with the $N$ signals flying between
Alice and Bob; by giving Eve a purification of the whole state, we give her the maximal
possible information. When Alice and Bob measure, the total state becomes a
classical-classical-quantum (ccq) state. After error correction (EC), Alice's and Bob's
classical lists are equal, but Eve has gained some additional classical information:
$\rho_E^{a} = \sum_b p(b|a)\, \rho_E^{a,b} \otimes |C(b \to a)\rangle\langle C(b \to a)|$. After privacy amplification (PA), the lists
of Alice and Bob are shorter, but still equal and now drawn from a completely random 
distribution, on which Eve has no information. 

This is for the ideal case. In order to estimate $\ell$, the first step consists in defining a
security parameter. The most convenient choice, actually the only one with the suitable 
properties known to date (see Q for a discussion), is the probability that the processed 
state differs from the ideal one: 

$$ \varepsilon = \frac{1}{2} \left\| \rho_{KE} - \rho_U \otimes \rho_E \right\|_1 . \qquad (3) $$

So, suppose we choose a value for $\varepsilon$ (say $10^{-9}$, i.e. I want at most one run in one billion to
go wrong) and our data processing defines $N$: how do we get the corresponding $\ell$? This is the
subject of the next paragraph.
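
To get a feeling for the security parameter (3), the following sketch evaluates it in the special case where Eve's information is classical, so that the trace distance reduces to a total variation distance between probability tables. This restriction, the toy distributions and the function name are assumptions made for illustration only; the definition in the text covers quantum side information.

```python
def security_parameter(joint: dict, key_space: list) -> float:
    """Eq. (3) for classical side information.

    joint maps (k, e) -> probability of key value k and Eve's record e.
    Returns 0.5 * sum_{k,e} |P(k,e) - P_U(k) P(e)|, with P_U uniform on key_space.
    """
    # Eve's marginal distribution P(e)
    marginal_e = {}
    for (_, e), p in joint.items():
        marginal_e[e] = marginal_e.get(e, 0.0) + p
    uniform = 1.0 / len(key_space)
    return 0.5 * sum(
        abs(joint.get((k, e), 0.0) - uniform * p_e)
        for k in key_space
        for e, p_e in marginal_e.items()
    )

keys = [0, 1]
# Ideal case: the key bit is uniform and independent of Eve's record
ideal = {(k, e): 0.25 for k in keys for e in ("x", "y")}
# Worst case: Eve's record reveals the key bit
leaky = {(0, "x"): 0.5, (1, "y"): 0.5}

print("ideal key:", security_parameter(ideal, keys))
print("leaked key:", security_parameter(leaky, keys))
```

The ideal table gives a vanishing parameter, while the fully leaked one-bit key gives 1/2, the largest value possible for a single bit with this definition.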



3.2. Renner's version of Murphy's law 



The recipe is: quantify everything that can go wrong, going backwards, i.e. starting from 
PA and going up to parameter estimation. Here is what it gives in detail: 

1. First step: privacy amplification. The probability that PA fails even if everything
has been perfectly carried out is

$$ \varepsilon_{PA} = 2^{-\frac{1}{2}\left[ H_{\min}^{\bar\varepsilon}(A^n|\mathcal{E}) - \ell \right]} . \qquad (4) $$

As usual, for the proof and the exact definitions I refer to the original papers
quoted above. Let us however get an intuition of this result. The symbol $\mathcal{E}$ represents
all that Eve has learned, both from attacking the quantum channel and from
listening to error correction. Alice's information is denoted by $A^n$ as a reminder that
it is a sequence of $n$ bits. The main object introduced here, $H_{\min}^{\bar\varepsilon}(\cdot|\cdot)$, is the
quantum conditional smooth min-entropy, with $\bar\varepsilon$ the smoothing parameter. It is
not astonishing that a min-entropy appears here: as we said above, the final key
should be randomly distributed given Eve's knowledge. In classical information
theory, the min-entropy precisely quantifies the fraction of completely random
bits one can extract from a given list of partially random bits. The challenge consisted
in defining the quantity that plays the same role in the case where an adversary
has quantum information. This was done in Renner's thesis [3]. So, in a
sense, (4) can be seen as a definition of smooth min-entropy, or better, as one of
the desiderata: the smooth min-entropy must be such that (4) holds.
Now, for the right-hand side of (4) to be bounded, one must have

$$ \ell \le H_{\min}^{\bar\varepsilon}(A^n|\mathcal{E}) - 2\log_2\frac{1}{\varepsilon_{PA}} . \qquad (5) $$

It is crucial to understand that this is already the desired bound for $\ell$. Unfortunately,
there is no easy way of computing $H_{\min}^{\bar\varepsilon}(A^n|\mathcal{E})$: the next steps will
merely bound this last quantity, ultimately reaching a computable expression.

2. Second step: error correction. First, we have to add an $\varepsilon_{EC}$ to the failure probability
$\varepsilon$. This measures the probability that the EC procedure was apparently successful
but some uncorrected errors remain². Moreover, as we said, during EC,
additional data $C$ has been provided to Eve, giving an amount of information
$\mathrm{leak}_{EC}$; by using a chain rule, one can split this information from the one acquired
in attacking the quantum channel, denoted by $E$. So we obtain the expression

$$ \ell \le H_{\min}^{\bar\varepsilon}(A^n|E) - \mathrm{leak}_{EC} - 2\log_2\frac{1}{\varepsilon_{PA}} . \qquad (6) $$

Theory predicts $\mathrm{leak}_{EC} \approx f\,H_0(A^n|B^n) + \log_2(2/\varepsilon_{EC})$, but ultimately this is the number
of bits really exchanged during EC: it does not need theoretical modeling (unless
one is working on designing a better EC code), just pick it from the real run of
the EC code.

3. Third step: Eve's attack on the quantum channel. Now we are back at the
level of quantum interaction and we have to compute $H_{\min}^{\bar\varepsilon}(A^n|E)$: how much
information Eve might have acquired, given the observed disturbances $V$. Under
the assumption of collective attacks, one finds

$$ H_{\min}^{\bar\varepsilon}(A^n|E) \ge n\left[ H(A|E) - 7\sqrt{\frac{\log_2(2/\bar\varepsilon)}{n}} \right] . \qquad (7) $$

The assumption of collective attacks is not restrictive for the Bennett-Brassard
1984 (BB84) protocol and many others, under the validity of the squashing condition³.
In order to remove this assumption, one could resort to the suitable De
Finetti theorem, but the bound scales very badly. A much more promising approach
is based on Ref. [14], but to my knowledge it has not been applied yet in
the context of finite keys.
4. Fourth step: optimize over Eve's attacks. Since we don't know which attack
Eve has actually performed, $H(A|E)$ must be computed for the best possible
attack compatible with the measured parameters $V$. But these parameters have
been measured on a finite sample: they are subject to fluctuations $\Delta V$. Specifically,
if a value $V_m$ has been obtained after averaging on $m$ samples, one has⁴

$$ \Delta V = \sqrt{\frac{\ln(1/\varepsilon_{PE}) + d\,\ln(m+1)}{2m}} \qquad (8) $$

with $\varepsilon_{PE}$ the probability that the fluctuation is actually larger than $\Delta V$, and $d$ the
minimal number of outcomes of the POVM that is used to estimate $V$ (typically,
$d = 2$ for errors on bits: one has to estimate the probabilities of the event "bits
equal" and of the event "bits different").

2 This may sound cumbersome at first reading, but it is actually clear. Any EC procedure has a possible
red-flag outcome: "Sorry, the procedure was not successful". In this case, one just discards the raw key, without
compromising the security. Problems arise only when there are still uncorrected errors left but the red flag is
not raised.

3 This condition means that one can prove the security of a protocol on the level of qubits, even if physically
light fields are infinite-dimensional systems. In other words, one can "squash" the meaningful degrees of
freedom down to qubits. This statement has been rigorously proved for the BB84 protocol [12,13].
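
The statistical fluctuation of the fourth step is easy to tabulate. The sketch below assumes the expression $\Delta V = \sqrt{[\ln(1/\varepsilon_{PE}) + d\ln(m+1)]/(2m)}$ quoted in the text, with $d = 2$ as for a bit error rate (the note added by the author warns that the formula is imprecise for $d > 2$). The function name and the sample sizes are chosen here for illustration.

```python
import math

def delta_v(m: int, eps_pe: float, d: int = 2) -> float:
    """Fluctuation of a parameter estimated on m samples, Eq. (8) as assumed above."""
    return math.sqrt((math.log(1.0 / eps_pe) + d * math.log(m + 1)) / (2.0 * m))

# How the precision on an error rate improves with the sample size
for m in (10**3, 10**4, 10**5, 10**6):
    print(f"m = {m:>7d}  ->  Delta V = {delta_v(m, eps_pe=1e-3):.4f}")
```

With a thousand samples the fluctuation is of the order of 10%, far larger than a typical error rate of a few percent; it takes several hundred thousand samples to bring it below 1%.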

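Putting everything together, one obtains (2) with

$$ \Delta(n) = \frac{2}{n}\log_2\frac{1}{\varepsilon_{PA}} + 7\sqrt{\frac{\log_2(2/\bar\varepsilon)}{n}} . \qquad (9) $$
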
Since failure probabilities add, the total error is

$$ \varepsilon = \varepsilon_{PA} + \bar\varepsilon + n_{PE}\,\varepsilon_{PE} + \varepsilon_{EC} \qquad (10) $$

with $n_{PE}$ the number of parameters to be estimated.
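
The pieces of the derivation can be assembled numerically. The sketch below assumes that the privacy amplification overhead has the form $\Delta(n) = \frac{2}{n}\log_2\frac{1}{\varepsilon_{PA}} + 7\sqrt{\log_2(2/\bar\varepsilon)/n}$ and that the bound (2) reads $r_N = \frac{n}{N}\left[\min H(A|E) - \Delta(n) - \mathrm{leak}_{EC}/n\right]$. The entropy value and the leakage used in the example are hypothetical inputs, not results from any protocol analysis.

```python
import math

def delta_n(n: float, eps_pa: float, eps_bar: float) -> float:
    """Overhead of privacy amplification on n raw-key bits (assumed form of Eq. (9))."""
    return (2.0 / n) * math.log2(1.0 / eps_pa) + 7.0 * math.sqrt(math.log2(2.0 / eps_bar) / n)

def total_epsilon(eps_pa: float, eps_bar: float, eps_pe: float, eps_ec: float, n_pe: int = 1) -> float:
    """Total failure probability, Eq. (10): the individual failure probabilities add."""
    return eps_pa + eps_bar + n_pe * eps_pe + eps_ec

def secret_fraction(N: float, n: float, h_ae_min: float, leak_ec_bits: float,
                    eps_pa: float, eps_bar: float) -> float:
    """Finite-key secret fraction, Eq. (2).

    h_ae_min is min H(A|E) per bit, already evaluated at the worst case within V +/- Delta V.
    leak_ec_bits is the number of bits disclosed during error correction.
    """
    return (n / N) * (h_ae_min - delta_n(n, eps_pa, eps_bar) - leak_ec_bits / n)

eps = 2.0 ** -10                       # all epsilons of the order 10^-3
N, n = 1_000_000, 500_000              # half of the signals kept for the raw key
r_N = secret_fraction(N, n, h_ae_min=0.8, leak_ec_bits=0.2 * n, eps_pa=eps, eps_bar=eps)
print(f"Delta(n)      = {delta_n(n, eps, eps):.4f}")
print(f"r_N           = {r_N:.4f}")
print(f"total epsilon = {total_epsilon(eps, eps, eps, eps):.2e}")
```

Keeping the worst-case entropy as an input mirrors the structure of the proof: the protocol-specific work is entirely in computing $\min H(A|E)$ for the parameters compatible with $V \pm \Delta V$.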

This concludes our overview of the meaning and derivation of the finite-key bound
(2). As we mentioned, there are alternative approaches [7,8,9,10]: as they stand, these
approaches are specifically tailored for the BB84 protocol; but there is no reason to doubt
that they can be adapted to other protocols as well.



4. Rapid estimates of finite key effects 

To date, the bound (2) has been applied to the BB84 protocol, both in the ideal single
photon case [4] and in more realistic scenarios [6], as well as to ideal implementations of
the six-state protocol [4] and of a modified Ekert protocol [5]. In spite of the differences
and for a wide range of reasonable values of parameters, a common feature is observed:
$r_N$ becomes larger than 0 only for $N \approx 10^5$ to $10^6$. Here, we show with a simple estimate
that this is indeed the case.

We suppose that all epsilons are of the order $10^{-3} \approx 2^{-10}$. This is quite a poor
requirement: it means that the key exchange may fail once every thousand runs;
but anyway, one would only get a small overhead by choosing $10^{-9}$ instead, because the
dependence on the epsilons is logarithmic.
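
The weak dependence on the epsilons can be checked directly. The sketch below assumes the form $\Delta(n) = \frac{2}{n}\log_2\frac{1}{\varepsilon_{PA}} + 7\sqrt{\log_2(2/\bar\varepsilon)/n}$ for the privacy amplification overhead and compares two choices of the epsilons at a fixed, illustrative raw-key length.

```python
import math

def delta_n(n: float, eps_pa: float, eps_bar: float) -> float:
    """Overhead of privacy amplification on n raw-key bits (assumed form of Eq. (9))."""
    return (2.0 / n) * math.log2(1.0 / eps_pa) + 7.0 * math.sqrt(math.log2(2.0 / eps_bar) / n)

n = 500_000                                  # illustrative raw-key length
loose = delta_n(n, eps_pa=1e-3, eps_bar=1e-3)
strict = delta_n(n, eps_pa=1e-9, eps_bar=1e-9)
ratio = strict / loose
print(f"Delta(n) at 1e-3: {loose:.4f}")
print(f"Delta(n) at 1e-9: {strict:.4f}")
print(f"ratio: {ratio:.2f}")
```

Tightening the epsilons by six orders of magnitude increases the overhead by less than a factor of two.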

Another useful approximation is $m \approx n \approx N/2$. Close to the critical point where $r_N$
passes from being 0 to being positive, this is always observed in exact numerical estimates
of (2), and it is quite reasonable: in the critical regime, one tries to devote sufficiently many
signals to the parameter estimation, without compromising the key.

Under these two assumptions, we have the following approximate values for (9) and
(8) respectively:

$$ \Delta(n) \approx \frac{40}{N} + \frac{33}{\sqrt{N}} , \qquad \Delta V \approx \sqrt{\frac{6.9 + 2\ln(N)}{N}} . $$


4 Note added: the following equation is imprecise for d > 2: see a better discussion in L. Sheridan and V. 
Scarani, Phys. Rev. A 82, 030301(R) (2010). 



Let's consider two case studies: 

1. Case study: effect of $\Delta(n)$. Suppose we are in a regime of parameters such
that $r_\infty = 0.1$. Neglecting the parameter fluctuations $\Delta V$, from (2) one has
$r_N \approx \frac{n}{N}\left[ r_\infty - \Delta(n) \right]$. This quantity is positive for $N \gtrsim 10^5$. Of course, the estimate
becomes worse, the smaller $r_\infty$ is (for instance, as a function of the distance in
a practical implementation). In summary, the finite-sample corrections to privacy
amplification force a lower bound of $N \approx 10^5$ signals per run.

2. Case study: effect of $\Delta V$. Consider $V$ as the most typical parameter in QKD,
namely an error rate. Because of the quality of optical setups, errors normally
range around 1-2%. At which point $r_N$ is zero depends on many variables and
of course on the protocol, so we cannot be very specific here. But if one requests
that error rates are known to a precision $\Delta V \approx 0.5\%$, one needs $N \sim 10^6$ signals
per run.

From these case studies, we see how the estimate $N \gtrsim 10^5$ to $10^6$ arises from quite
trivial estimates of the finite-key effects, independently of the details of the protocols.
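
The two case studies can be reproduced with a few lines of code. The sketch below assumes the expressions $\Delta(n) = \frac{2}{n}\log_2\frac{1}{\varepsilon_{PA}} + 7\sqrt{\log_2(2/\bar\varepsilon)/n}$ and $\Delta V = \sqrt{[\ln(1/\varepsilon_{PE}) + d\ln(m+1)]/(2m)}$, all epsilons equal to $2^{-10}$, and $m = n = N/2$. The bisection helper is a hypothetical utility that relies on both quantities decreasing with $N$.

```python
import math

EPS = 2.0 ** -10  # all epsilons of the order 10^-3

def delta_n(n: float, eps: float = EPS) -> float:
    """Privacy amplification overhead (assumed form of Eq. (9))."""
    return (2.0 / n) * math.log2(1.0 / eps) + 7.0 * math.sqrt(math.log2(2.0 / eps) / n)

def delta_v(m: float, eps: float = EPS, d: int = 2) -> float:
    """Statistical fluctuation of an estimated parameter (assumed form of Eq. (8))."""
    return math.sqrt((math.log(1.0 / eps) + d * math.log(m + 1.0)) / (2.0 * m))

def smallest_n_signals(f, target: float, lo: int = 10, hi: int = 10**9) -> int:
    """Smallest integer N in (lo, hi] with f(N) <= target, for f decreasing in N."""
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if f(mid) <= target:
            hi = mid
        else:
            lo = mid
    return hi

# Case study 1: r_N > 0 requires Delta(n) < r_inf = 0.1, with n = N/2
n_case1 = smallest_n_signals(lambda N: delta_n(N / 2.0), target=0.1)

# Case study 2: error rate known to a precision Delta V = 0.5%, with m = N/2
n_case2 = smallest_n_signals(lambda N: delta_v(N / 2.0), target=0.005)

print(f"Case 1: N of about {n_case1:.2e} signals")
print(f"Case 2: N of about {n_case2:.2e} signals")
```

Under these assumptions the first threshold comes out at the order of $10^5$ signals and the second at the order of $10^6$, in line with the estimates in the text.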

5. Conclusion 

Claims of security of QKD could not be complete without integrating the effects of finite 
sampling, or finite-key in short. Thanks to the work of several authors, this task has now 
been accomplished. One of the most striking features is the fact that, to have any final 
key whatsoever, approximately one million signals must be processed in each run — and 
the more, the better. In this text, after reviewing the main ingredients of the finite-key 
bounds, I have stressed that the lower bounds on the number of signals can be obtained 
from very simple estimates and are therefore rather robust. While the performances of 
QKD may still be improved in many aspects, it seems unavoidable that QKD will always 
be a million-signal task; or, to put it more positively, I would consider it as a milestone 
if someone could find a way of improving significantly on these bounds.